Overview

Dataset statistics

Number of variables: 20
Number of observations: 13495
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 355.9 KiB
Average record size in memory: 27.0 B

Variable types

BOOL: 11
NUM: 8
CAT: 1

Warnings

df_index has unique values (Unique)
voiced has 5268 (39.0%) zeros (Zeros)
ani_story has 8402 (62.3%) zeros (Zeros)
ani_ero has 10486 (77.7%) zeros (Zeros)

Reproduction

Analysis started: 2020-10-29 12:30:04.159212
Analysis finished: 2020-10-29 12:30:16.656948
Duration: 12.5 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
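
A report of this kind can be regenerated with a few lines of Python. The following is a minimal sketch, assuming pandas-profiling v2.9.0 is installed and that the data is already loaded into a pandas DataFrame. The function name, the default output path, the title and the input file name in the usage comment are illustrative assumptions, not taken from the report.

```python
def build_profile_report(df, output_path="report.html", title="Dataset profile"):
    """Profile a pandas DataFrame and write the HTML report to output_path.

    The helper name and default arguments are hypothetical; only
    ProfileReport(...) and .to_file(...) come from pandas-profiling.
    """
    # Imported lazily so this module still loads where the package is absent.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title=title)
    profile.to_file(output_path)
    return output_path


# Usage (hypothetical input file):
#   import pandas as pd
#   df = pd.read_csv("games.csv")
#   build_profile_report(df)
```

The original run also exposed a config.yaml; this sketch relies on the library defaults instead, so details such as bin counts may differ from the figures reported here.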

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 13495
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35731.83942
Minimum: 0
Maximum: 71510
Zeros: 1
Zeros (%): < 0.1%
Memory size: 105.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1608.7
Q1: 20005
median: 37457
Q3: 52177.5
95-th percentile: 66694.3
Maximum: 71510
Range: 71510
Interquartile range (IQR): 32172.5

Descriptive statistics

Standard deviation: 20270.15712
Coefficient of variation (CV): 0.5672855763
Kurtosis: -1.06056629
Mean: 35731.83942
Median Absolute Deviation (MAD): 15942
Skewness: -0.1718357802
Sum: 482201173
Variance: 410879269.6
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Common values

Value                  Count  Frequency (%)
36862                  1      < 0.1%
70359                  1      < 0.1%
54047                  1      < 0.1%
23326                  1      < 0.1%
51996                  1      < 0.1%
62235                  1      < 0.1%
60184                  1      < 0.1%
35604                  1      < 0.1%
41745                  1      < 0.1%
43792                  1      < 0.1%
Other values (13485)   13485  99.9%

Minimum 5 values

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
71510  1      < 0.1%
71505  1      < 0.1%
71500  1      < 0.1%
71442  1      < 0.1%
71440  1      < 0.1%

type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
0      11124  82.4%
2      1530   11.3%
1      841    6.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

patch
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
0      10998  81.5%
1      2497   18.5%

freeware
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
0      8724   64.6%
1      4771   35.4%

doujin
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
0      11842  87.8%
1      1653   12.2%

voiced
Real number (ℝ≥0)

ZEROS

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.111004076
Minimum: 0
Maximum: 4
Zeros: 5268
Zeros (%): 39.0%
Memory size: 13.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 3
Q3: 4
95-th percentile: 4
Maximum: 4
Range: 4
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.881113686
Coefficient of variation (CV): 0.8910990307
Kurtosis: -1.899453364
Mean: 2.111004076
Median Absolute Deviation (MAD): 1
Skewness: -0.09784837754
Sum: 28488
Variance: 3.538588698
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=5)

Common values

Value  Count  Frequency (%)
4      6207   46.0%
0      5268   39.0%
1      1198   8.9%
3      818    6.1%
2      4      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      5268   39.0%
1      1198   8.9%
2      4      < 0.1%
3      818    6.1%
4      6207   46.0%

Maximum 5 values

Value  Count  Frequency (%)
4      6207   46.0%
3      818    6.1%
2      4      < 0.1%
1      1198   8.9%
0      5268   39.0%
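
Because voiced takes only five distinct values, its descriptive statistics can be recomputed from the value counts alone. The sketch below does this with the counts listed above; it assumes the variance is the sample variance (divisor n - 1), which is the pandas default and is consistent with the reported figure. The variable names are illustrative.

```python
import math

# Value counts for `voiced`, copied from the table above.
voiced_counts = {0: 5268, 1: 1198, 2: 4, 3: 818, 4: 6207}

n = sum(voiced_counts.values())                       # number of observations
total = sum(v * c for v, c in voiced_counts.items())  # "Sum" in the report
mean = total / n

# Sample variance (divisor n - 1), assumed to match the report's convention.
variance = sum(c * (v - mean) ** 2 for v, c in voiced_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                       # coefficient of variation

print(n, total, round(mean, 6), round(variance, 6), round(std, 6), round(cv, 6))
```

The same approach works for ani_story and ani_ero, whose full value counts are also listed in the report.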
 

ani_story
Real number (ℝ≥0)

ZEROS

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6117821415
Minimum: 0
Maximum: 4
Zeros: 8402
Zeros (%): 62.3%
Memory size: 13.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 4
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9043939739
Coefficient of variation (CV): 1.478294171
Kurtosis: 1.120732806
Mean: 0.6117821415
Median Absolute Deviation (MAD): 0
Skewness: 1.365172939
Sum: 8256
Variance: 0.81792846
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=5)

Common values

Value  Count  Frequency (%)
0      8402   62.3%
1      2566   19.0%
2      2008   14.9%
3      402    3.0%
4      117    0.9%

Minimum 5 values

Value  Count  Frequency (%)
0      8402   62.3%
1      2566   19.0%
2      2008   14.9%
3      402    3.0%
4      117    0.9%

Maximum 5 values

Value  Count  Frequency (%)
4      117    0.9%
3      402    3.0%
2      2008   14.9%
1      2566   19.0%
0      8402   62.3%

ani_ero
Real number (ℝ≥0)

ZEROS

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2971470915
Minimum: 0
Maximum: 4
Zeros: 10486
Zeros (%): 77.7%
Memory size: 13.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 4
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6480146146
Coefficient of variation (CV): 2.180787338
Kurtosis: 8.745411176
Mean: 0.2971470915
Median Absolute Deviation (MAD): 0
Skewness: 2.743001357
Sum: 4010
Variance: 0.4199229408
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=5)

Common values

Value  Count  Frequency (%)
0      10486  77.7%
1      2367   17.5%
2      347    2.6%
3      231    1.7%
4      64     0.5%

Minimum 5 values

Value  Count  Frequency (%)
0      10486  77.7%
1      2367   17.5%
2      347    2.6%
3      231    1.7%
4      64     0.5%

Maximum 5 values

Value  Count  Frequency (%)
4      64     0.5%
3      231    1.7%
2      347    2.6%
1      2367   17.5%
0      10486  77.7%

uncensored
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
0      12882  95.5%
1      613    4.5%

engine
Real number (ℝ≥0)

Distinct: 92
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.9442016
Minimum: 1
Maximum: 184
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 58
Q1: 178
median: 178
Q3: 178
95-th percentile: 178
Maximum: 184
Range: 183
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 40.3272178
Coefficient of variation (CV): 0.2553257252
Kurtosis: 2.838293411
Mean: 157.9442016
Median Absolute Deviation (MAD): 0
Skewness: -1.973254274
Sum: 2131457
Variance: 1626.284496
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
178                10127  75.0%
85                 789    5.8%
126                778    5.8%
31                 188    1.4%
39                 145    1.1%
166                106    0.8%
125                100    0.7%
153                92     0.7%
106                91     0.7%
30                 91     0.7%
Other values (82)  988    7.3%

Minimum 5 values

Value  Count  Frequency (%)
1      3      < 0.1%
2      38     0.3%
3      2      < 0.1%
9      8      0.1%
13     2      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
184    3      < 0.1%
182    2      < 0.1%
178    10127  75.0%
177    2      < 0.1%
176    44     0.3%

l_steam
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
1      12868  95.4%
0      627    4.6%

l_digiket
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
1      12867  95.3%
0      628    4.7%

l_melon
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
1      13491  > 99.9%
0      4      < 0.1%

l_mg
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
1      13265  98.3%
0      230    1.7%

l_getchu
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
1      12859  95.3%
0      636    4.7%

l_toranoana
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
1      13400  99.3%
0      95     0.7%

l_melonjp
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Value  Count  Frequency (%)
1      13366  99.0%
0      129    1.0%

lang
Real number (ℝ≥0)

Distinct: 31
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.50633568
Minimum: 0
Maximum: 39
Zeros: 2
Zeros (%): < 0.1%
Memory size: 13.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 17
median: 17
Q3: 17
95-th percentile: 37
Maximum: 39
Range: 39
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 8.102130837
Coefficient of variation (CV): 0.4628113493
Kurtosis: 1.689457476
Mean: 17.50633568
Median Absolute Deviation (MAD): 0
Skewness: 1.15240838
Sum: 236248
Variance: 65.64452411
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Common values

Value              Count  Frequency (%)
17                 9365   69.4%
6                  2049   15.2%
37                 685    5.1%
39                 581    4.3%
29                 321    2.4%
8                  138    1.0%
18                 94     0.7%
10                 67     0.5%
26                 38     0.3%
16                 29     0.2%
Other values (21)  128    0.9%

Minimum 5 values

Value  Count  Frequency (%)
0      2      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      4      < 0.1%
5      29     0.2%

Maximum 5 values

Value  Count  Frequency (%)
39     581    4.3%
38     26     0.2%
37     685    5.1%
35     5      < 0.1%
34     2      < 0.1%

month
Real number (ℝ≥0)

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.676546869
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.433752628
Coefficient of variation (CV): 0.514300685
Kurtosis: -1.210126169
Mean: 6.676546869
Median Absolute Deviation (MAD): 3
Skewness: -0.03526127363
Sum: 90100
Variance: 11.79065711
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Common values

Value             Count  Frequency (%)
12                1302   9.6%
4                 1208   9.0%
10                1193   8.8%
7                 1182   8.8%
9                 1153   8.5%
8                 1139   8.4%
3                 1132   8.4%
11                1076   8.0%
6                 1066   7.9%
5                 1058   7.8%
Other values (2)  1986   14.7%

Minimum 5 values

Value  Count  Frequency (%)
1      939    7.0%
2      1047   7.8%
3      1132   8.4%
4      1208   9.0%
5      1058   7.8%

Maximum 5 values

Value  Count  Frequency (%)
12     1302   9.6%
11     1076   8.0%
10     1193   8.8%
9      1153   8.5%
8      1139   8.4%
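
The quantile statistics for month can be checked from its value counts, since all twelve counts appear in the tables above. The sketch below expands the counts into a sorted list and applies linear interpolation between neighbouring order statistics. That interpolation rule is an assumption about how the report computes quantiles, and the helper names are illustrative.

```python
import math

# Value counts for `month`, gathered from the tables above.
month_counts = {1: 939, 2: 1047, 3: 1132, 4: 1208, 5: 1058, 6: 1066,
                7: 1182, 8: 1139, 9: 1153, 10: 1193, 11: 1076, 12: 1302}

def quantile(sorted_values, q):
    """Quantile with linear interpolation between adjacent order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

# Expand the counts into one sorted list of 13495 observations.
months = [m for m in sorted(month_counts) for _ in range(month_counts[m])]

q1, median, q3 = (quantile(months, q) for q in (0.25, 0.5, 0.75))
iqr = q3 - q1

# Median absolute deviation: the median of |x - median|.
mad = quantile(sorted(abs(m - median) for m in months), 0.5)

print(q1, median, q3, iqr, mad)
```

With these counts the sketch yields Q1 = 4, median = 7, Q3 = 10, IQR = 6 and MAD = 3, in agreement with the values listed for month.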
 

c_popularity
Real number (ℝ≥0)

Distinct: 52
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.040311226
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 7
95-th percentile: 40
Maximum: 100
Range: 99
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 14.79518304
Coefficient of variation (CV): 1.840125665
Kurtosis: 12.5356588
Mean: 8.040311226
Median Absolute Deviation (MAD): 1
Skewness: 3.359126292
Sum: 108504
Variance: 218.8974411
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
1                  5277   39.1%
2                  1965   14.6%
3                  1111   8.2%
4                  827    6.1%
5                  439    3.3%
6                  352    2.6%
7                  315    2.3%
9                  272    2.0%
8                  248    1.8%
18                 195    1.4%
Other values (42)  2494   18.5%

Minimum 5 values

Value  Count  Frequency (%)
1      5277   39.1%
2      1965   14.6%
3      1111   8.2%
4      827    6.1%
5      439    3.3%

Maximum 5 values

Value  Count  Frequency (%)
100    36     0.3%
87     37     0.3%
85     24     0.2%
80     89     0.7%
77     28     0.2%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
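
The definition above translates directly into code. This is a minimal pure-Python sketch, not the routine the report itself uses; the function name and the sample data are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n - 1) factors of the covariance and the two standard
    # deviations cancel, so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(xs, [10 * y + 7 for y in ys])
print(r, r_rescaled)
```

A constant input makes the denominator zero, so a production version would need to handle that case explicitly.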

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
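
As a sketch of that recipe, the code below ranks both variables and then applies the Pearson formula to the ranks. Ties receive the average of the ranks they span, which is a common convention but an assumption here; all names are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]  # monotonic but not linear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic but nonlinear example ρ equals 1 while r stays below 1, which illustrates the difference between the two coefficients.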

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
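
The pair-counting description corresponds to the variant usually called tau-a. The sketch below implements exactly that description in pure Python; statistical libraries commonly report the tie-adjusted tau-b instead, so results can differ when the data contain ties, as the variables in this dataset do. The function name and sample data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Five concordant pairs and one discordant pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The loop visits every pair, so the cost grows quadratically with the number of observations; for the 13495 rows profiled here that is roughly 91 million pairs per variable pair.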

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values


Sample

First rows

df_index type patch freeware doujin voiced ani_story ani_ero uncensored engine l_steam l_digiket l_melon l_mg l_getchu l_toranoana l_melonjp lang month c_popularity
0000000000178111111137104
1100004310178111111117265
22000031117411111111784
33000011107411111116615
4400001110178111101117615
5500004310178111111117365
6600004300178111111117965
7700004210178111111117242
8800004210178111111117442
990000111174111111117715

Last rows

df_index type patch freeware doujin voiced ani_story ani_ero uncensored engine l_steam l_digiket l_melon l_mg l_getchu l_toranoana l_melonjp lang month c_popularity
134857141000111200106111111117103
1348671426000040001781011111654
1348771427000040001781011111664
13488714280000411017810111111743
13489714290000411017810111111746
134907144000004000211111116101
1349171442000040001781111011171241
134927150000111200106111111117103
13493715050000400017811111116101
134947151001100000178111111117104